SUMMER FACULTY REPORT

System Safety Engineering, CT-21, Marshall Space Flight Center
Stephen J. Morrissey, Ph.D., Summer, 1990 

SUMMARY 

This report summarizes my findings on the problems I was asked
to address during my stay. There were five basic problem or
question areas. Four of the five are examined individually in the
following pages. The fifth was to provide recommendations; these
are included with each of the four major problem areas.


1: EVALUATE ADEQUACY OF CURRENT PROBLEM/PERFORMANCE DATA BASE

Problem and performance identification and evaluation are defined
by PRACA requirements, with each of the major system contractors
having its own contractual arrangements, which are also based on
PRACA. Under this system, reports of problems or unusual and
unexpected events or conditions come from the contractor, from
acceptance/qualification testing, and from in-flight and post
flight analyses (PFA). When problem report data is received, it is
evaluated for its criticality and uniqueness. If an observed
problem is deemed to meet these requirements, it is entered as a
problem report and enters the reporting and evaluation procedure.

1. Data Acquisition: Calspan is automating the data reporting
and trending efforts. This is a relatively new project and will
allow for identification of trends in data for established
(previously identified) problems and for classification of
newly reported problems or unusual events that do not have
FMEA-CIL numbers. This effort uses the traditional data sources
described above and is also working to integrate data from
contractors' internal data bases.

Comments:

*This effort can be improved by more contractor cooperation in 
sharing their data bases, by inclusion of data derived from 
standard repair procedures, and by improving the communication 
between MSFC and contractors at other locations. Several 
contractors are providing excellent data from their internal 
data bases. 

*Because of the importance of understanding the PFA problem
identification methods, it is recommended that responsible
individuals participate in these sessions at least once a year.
This can be by live participation or by viewing the videotaped
activities. In a similar fashion, individuals who are
required to evaluate other systems should have exposure to the
contractors' actual operations. This will enhance their
understanding of the physical hardware and how it is prepared,
tested and evaluated.

*It has been observed that different divisions within SRM&QA are
working on projects or have contractor projects that are of
great interest and use to other sections. Knowledge of other
divisions' projects is not totally comprehensive, and while
there is no evidence of attempts to prevent other groups and
divisions from knowing what is being done, communication is not
as complete as it should be. It is recommended that some method
of cross communication be established, perhaps a session in
which each group leader gives a short presentation that
outlines active projects and the group's current and future
operational needs. From these discussions a better understanding
of each group's capabilities and needs will be possible, and a
more coordinated effort should result.

*The post flight analysis is primarily done by engineering
personnel, and considerable experience has been gained. This
experience base is now sufficiently developed so that this
activity can be taken over by quality oriented rather than
design oriented personnel. Such a change in orientation should
improve the reviews if the knowledge and experience base gained
in PFA inspections can be translated into inspection criteria,
possibly with an expert system. Such a shift in orientation
should also allow for a better understanding and control of the
variety of design waivers that exist on any system. This
inclusion of quality personnel should also facilitate problem
quantification and hazard analyses.

2. Data Evaluation-Reporting: Calspan develops a monthly report
called the Open Problems List (OPL) that lists and trends
problem reports that have been filed or closed during the
reporting interval. These reports are grouped by system (ET,
SRB, SRM) and by whether the reports have been closed or are
still open at the end of the reporting interval. The OPL is
distributed to a variety of users to help managers and other
personnel identify problems in the various systems.

An important issue has been how to deal with problem reports
and hazards that do not have FMEA-CIL numbers or that are new
or unique. This has been resolved by assigning problem reports
without FMEA/CIL numbers a criticality of one. These reports are
then grouped together, evaluated in a review session as to
their apparent criticality, and assigned to project groups to
develop FMEA-CIL documentation. This procedure should allow for
rapid identification and entry of new hazards into the FMEA/CIL
data base. In the last review of this type, 580 reports did not
have FMEA/CIL numbers. Of these, 440 were considered
"non-problems"; the remaining 140 have been assigned to project
groups to develop FMEA analyses and related CILs. The amount of
active participation by system safety in these reviews is not
clear.
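
The bookkeeping in this review can be sketched in a few lines of
Python. This is only an illustration of the procedure described
above, not the Calspan implementation; the record layout and the
names ProblemReport, assign_default_criticality and review_summary
are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ProblemReport:
    # Hypothetical record layout, for illustration only.
    report_id: str
    fmea_cil: Optional[str] = None     # None means no FMEA/CIL number yet
    criticality: Optional[int] = None


def assign_default_criticality(reports: List[ProblemReport]) -> List[ProblemReport]:
    """Give reports without an FMEA/CIL number a criticality of one
    and return them as the batch to be evaluated at the next review."""
    pending = []
    for report in reports:
        if report.fmea_cil is None:
            report.criticality = 1
            pending.append(report)
    return pending


def review_summary(reviewed: int, non_problems: int) -> Tuple[int, float]:
    """Return the number of reports assigned to project groups and
    that number as a fraction of all reports reviewed."""
    assigned = reviewed - non_problems
    return assigned, assigned / reviewed


# Counts from the last review cited above: 580 reviewed, 440 non-problems.
assigned, fraction = review_summary(580, 440)
print(assigned, round(fraction, 2))  # 140 0.24
```

On the figures from the last review, roughly one report in four
went on to FMEA/CIL development.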

Comments:

*A better method is needed to trace reported problems to their
basic or root cause(s) so that proper countermeasures or
corrective designs can be developed. This includes problems
that result from device/system failures, problems that
result from devices being out of tolerance but still
functioning, and human errors. This may also require better
reporting of problems arising from standard repair procedures.
While this type of reporting is a sensitive issue to some
contractors, this data is needed to ensure proper tracking of
problems. It can be argued that to provide this data would
require substantial additional reporting and accounting efforts
by contractors. However, this data should already be internally
available to the contractors. A possible compromise would be
to have contractors supply their own measure or ratio of units
accepted (of some particular type) to units sent (of some
particular type).

*A computerized, real time system to cross reference Hazard
Reports, CIL Numbers, FMEA Numbers and problem report numbers
and any waivers or proposed engineering changes needs to be
implemented.

*The current system for problem reporting developed by Calspan
has great potential. It should be expanded to allow the
following:

1. Interactive searching of the data base by non-Calspan
personnel. Because this is a relatively new program, this may
develop naturally with maturity.

2. A survey needs to be made of SRM&QA personnel to 
determine what other information would be useful for 
presentation in the monthly OPL report. 


2: EVALUATE METHODS OF PERFORMING TREND ANALYSIS

Currently there are two major efforts under way that involve
trend analysis. These are Performance Trending and Problem
Trending.

Performance Trending: Data from past launches is being used by
ATI to develop envelopes of nominal performance. Real time data
for a particular system or element across its operational time
is statistically evaluated to develop templates or control
chart limits describing the upper and lower values for the
parameter over time that have been observed 95% of the time.
These upper and lower limits provide a window within which real
time values of the parameter can be plotted, allowing
determination of the "acceptability" of the parameter compared
to past performance at that point in operational time.
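
A minimal sketch of such an envelope follows. It assumes that each
past launch supplies the parameter sampled at the same operational
times, and it takes the central 95% band at each time step from
empirical percentiles (the 2.5th and 97.5th). The report does not
say how ATI actually computes the limits, so the percentile method
and the function names here are assumptions.

```python
from statistics import quantiles


def nominal_envelope(past_runs):
    """Return (lower, upper) lists giving the central 95% band of the
    parameter at each time step across past launches.

    past_runs is a list of launches, each a list of parameter values
    sampled at the same operational times (an assumption).
    """
    lower, upper = [], []
    for values_at_t in zip(*past_runs):
        # n=40 yields cut points every 2.5%; the first and last bound
        # the central 95% of the observed values.
        cuts = quantiles(values_at_t, n=40, method="inclusive")
        lower.append(cuts[0])
        upper.append(cuts[-1])
    return lower, upper


def out_of_envelope(run, lower, upper):
    """Return the time-step indices at which a new run falls outside
    the envelope built from past performance."""
    return [t for t, value in enumerate(run)
            if not lower[t] <= value <= upper[t]]


# Illustrative numbers only, not flight data.
history = [
    [100, 200, 300],
    [102, 198, 305],
    [98, 202, 295],
    [101, 199, 300],
    [99, 201, 302],
]
low, high = nominal_envelope(history)
print(out_of_envelope([100, 250, 300], low, high))  # [1]
```

With only a handful of launches the percentile estimates are
crude, so limits built this way would need to be treated with
caution until more flights are available.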


Comments:

*This method has great usefulness for both real time LCC
decisions and for development of test/acceptance criteria.

*This method should be expanded to allow development of
multivariate plots, not just individual (sub)system responses.

*The performance envelopes developed by this method may be quite
different from the "red-line" values. A method of quantifying
the risk associated with observations in this region needs to
be developed.

Problem Trending: This is being addressed in two different ways.
The first is the OPL report discussed earlier, which lists problems
by system and criticality, and corrective actions (if any) that
have been taken in the past reporting period (typically one
month). The second effort is the "problem trending report" that
is issued every six months. This report uses all available
current and historical data for systems, elements, and
subsystems, and develops a variety of different trend analyses
for these data. Analyses typically are trend reports for
systems and elements, with detailed studies performed on
various elements or systems according to frequency or
criticality of events, and visibility. These trend analyses
use graphical and statistical methods, and can be used to
describe the effectiveness of design changes, or point out
areas needing control. The purpose of these efforts is to
provide guidance and oversight to managers, and to
facilitate tracking of problems and the effectiveness of
correction.

*Trending is a powerful tool to facilitate understanding and
control of the systems described. However, the feedback loop to
ensure compliance is not always present. In the trending
reports there is evidence of systems or components in which
design changes have been made, yet the rate of problems has not
changed, and in some cases the problem rate has increased.
Trend analysis is only as good as managers choose it to be.
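
One way to close this feedback loop is a routine check of the
problem rate before and after each design change. The sketch below
compares mean problem counts per reporting period on either side
of the change and marks the item for follow-up when the rate has
not fallen. The function names and the data layout are
hypothetical, and a plain comparison of means ignores statistical
significance; it is meant only to show the kind of check intended.

```python
def rates_around_change(counts_per_period, change_index):
    """Return (rate_before, rate_after): mean problem reports per
    reporting period before and after a design change.

    counts_per_period lists problem counts in time order;
    change_index is the first period in which the change is in effect.
    """
    before = counts_per_period[:change_index]
    after = counts_per_period[change_index:]
    if not before or not after:
        raise ValueError("need at least one period on each side of the change")
    return sum(before) / len(before), sum(after) / len(after)


def needs_follow_up(counts_per_period, change_index):
    """True when the problem rate did not drop after the change."""
    rate_before, rate_after = rates_around_change(counts_per_period, change_index)
    return rate_after >= rate_before


# Illustrative counts only: the change takes effect in the fourth period.
print(needs_follow_up([4, 5, 3, 4, 6, 5], 3))  # True
print(needs_follow_up([4, 5, 3, 1, 2, 0], 3))  # False
```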


3: METHODS AND SOURCES OF DATA FOR PROBABILISTIC RISK ASSESSMENT

There have been substantial efforts in the area of PRA to 
determine basic and time dependent reliabilities for elements, 
systems and subsystems, and towards life cycle characterizations. 
Data used for these evaluations comes from the basic sources of 
data already identified. These analyses have been oriented 
towards traditional reliability studies, and their system safety 
impact or inputs are not totally characterized, nor have system 
safety inputs been sought in any systematic fashion. 

Comments:

*Facilities and personnel are available to perform PRA at many
different levels of complexity and sophistication. However, this
resource is neither sufficiently recognized nor utilized as a
source to develop PRA criteria for FMEA/CILs or other types of
hazard analysis.


4: HOW IS RISK ASSESSMENT DOCUMENTATION UPGRADED/UPDATED?

This is currently performed by problem review boards or by
individuals raising concerns and initiating these changes and
modifications. Until the trending analysis efforts were
developed, this was the only way by which hazards and needed
revisions could be identified. The updating of documentation is a
different issue: hazard reports and CILs may often have
waivers, in-process modifications and engineering change
proposals active. Keeping track of these is difficult, and
currently is performed as much by word of mouth as by formal
lines of communication.

*This condition obviously needs to be improved, possibly by
using a more formalized procedure which would flag hazard
reports and FMEA-CILs whenever a waiver or engineering change
proposal that references them is active.
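
The flagging procedure suggested here amounts to a reverse index
from each hazard report or FMEA-CIL to the active waivers and
engineering change proposals that cite it. A minimal sketch
follows; the record fields ("id", "active", "references") and the
document numbers are invented for illustration and do not
correspond to any existing MSFC data base.

```python
from collections import defaultdict


def build_flags(change_items):
    """Map each referenced document number to the active waivers or
    engineering change proposals that cite it.

    change_items is an iterable of dicts with the hypothetical keys
    "id", "active" and "references".
    """
    flags = defaultdict(list)
    for item in change_items:
        if not item["active"]:
            continue  # closed items no longer flag anything
        for document_number in item["references"]:
            flags[document_number].append(item["id"])
    return dict(flags)


# Invented identifiers, for illustration only.
items = [
    {"id": "WAIVER-01", "active": True, "references": ["HR-12", "CIL-7"]},
    {"id": "ECP-03", "active": True, "references": ["HR-12"]},
    {"id": "WAIVER-02", "active": False, "references": ["CIL-9"]},
]
flags = build_flags(items)
print(flags["HR-12"])    # ['WAIVER-01', 'ECP-03']
print("CIL-9" in flags)  # False
```

Anyone retrieving a hazard report or CIL would then see its
outstanding waivers and change proposals without relying on word
of mouth.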